METHODS AND SYSTEMS FOR IMPROVING A SEARCH RANKING USING
POPULATION INFORMATION

FIELD OF THE INVENTION 

[0001] The invention generally relates to search engines. More particularly, the
invention relates to methods and systems for improving a search ranking using population
information.

BACKGROUND OF THE INVENTION 
[0002] Conventional search engines operating in a networked computer environment 
such as the World Wide Web or in an individual computer can provide search results in 
response to entry of a user's search query. In many instances, the search results are 
ranked in accordance with the search engine's scoring or ranking system or method. For 
example, conventional search engines score or rank documents of a search result for a 
particular query by the number of times a keyword or particular word or phrase appears 
in each document in the search results. Documents include, for example, web pages of 
various formats, such as HTML, XML, XHTML; Portable Document Format (PDF) files; 
and word processor and application program document files. Other search engines base 
scoring or ranking results on more than the content of the document. For example, one 
known method, described in an article entitled "The Anatomy of a Large-Scale 
Hypertextual Search Engine," by Sergey Brin and Lawrence Page, assigns a degree of 
importance to a document, such as a web page, based on the link structure of the web 
page. Other conventional methods involve selling a higher score or rank in search results
for a particular query to third parties that want to attract users or customers to their 
websites. 

[0003] In some instances, a user in a particular location may enter a search query in a 
search engine to obtain search results relevant to the user. For example, a user in Japan 
may enter a search query to obtain search results that include Japanese language websites. 
In response to such queries, conventional search engines can return unreliable search 
results since there is relatively little data to rank or score search results according to the 
user's location that are relevant or useful to the user for the search query. 

[0004] Conventional search engines can determine location information associated
with a user from the type of web browser application used to access the search engine. 
For example, when a user downloads a web browser application from the Internet, the 
user may have the option to download a particular version of the application depending 
upon the user's preferred language, e.g. Japanese or French versions. When a user uses 
the French version of a web browser application to access a search engine via the 
Internet, the search engine can often determine that the user is likely located in France 
merely by detecting use of the French version of the web browser application. 

[0005] Other conventional search engines obtain location information by the country 
domain suffix a particular user used in a search query. For example, a Japanese user 
requesting the Japanese version of a search engine may input the web address for the 
search engine with the country domain suffix of "co.jp" instead of the domain name 
suffix ".com." Based on such input, a search engine could determine that the user is 
likely located in Japan.

[0006] If a search engine returns more than one search result in response to a search
query, the search results may be displayed as a list of links to the documents associated 
with the search results. A user may browse and visit a website associated with one or 
more of the search results to evaluate whether the website is relevant to the user's search 
query. For example, a user may manipulate a mouse or another input device and "click" 
on a link to a particular search result to view a website associated with the search result. 
In many instances, the user will browse and visit several websites provided in the search 
result, clicking on links associated with each of the several websites to access various 
websites associated with the search results before locating useful or relevant information 
to address the user's search query. 

[0007] Clicking on multiple links to multiple websites associated with a single set of
search results can be time consuming. It is desirable to improve the ranking algorithm
used by search engines and thereby provide users with better search results.

SUMMARY 

[0008] Embodiments of the present invention comprise systems and methods that
improve search rankings for a search query by using population information associated
with the search query. One aspect of the present invention comprises
receiving a search query, and determining a population associated with the search query. 
Such populations may be defined and determined in a variety of ways. Another aspect of 
an embodiment of the present invention comprises determining an article (such as a 
webpage) associated with the search query, and determining a ranking score for the
article based at least in part on data associated with the population. A variety of 
algorithms using population information may be applied in such systems and methods. 

BRIEF DESCRIPTION OF THE DRAWINGS 
[0009] These and other features, aspects, and advantages of the present invention are 
better understood when the following Detailed Description is read with reference to the 
accompanying drawings, wherein: 

[0010] FIG. 1 illustrates a block diagram of a system in accordance with one 
embodiment of the present invention; 

[0011] FIG. 2 illustrates a flow diagram of a method in accordance with one
embodiment of the present invention; and 

[0012] FIG. 3 illustrates a flow diagram of a subroutine of the method shown in 
FIG. 2. 

DETAILED DESCRIPTION 
[0013] The present invention comprises methods and systems for improving a search 
ranking by using population information. Reference will now be made in detail to 
exemplary embodiments of the invention as illustrated in the text and accompanying 
drawings. The same reference numbers are used throughout the drawings and the 
following description to refer to the same or like parts. 

[0014] Various systems in accordance with the present invention may be constructed. 
FIG. 1 is a diagram illustrating an exemplary system in which exemplary embodiments of
the present invention may operate. The present invention may operate in, and be 
embodied in, other systems as well. 

[0015] The system 100 shown in FIG. 1 includes multiple client devices 102a-n, a 
server device 104, and a network 106. The network 106 shown includes the Internet. In 
other embodiments, other networks, such as an intranet, may be used. Moreover, methods
according to the present invention may operate in a single computer. The client devices 
102a-n shown each include a computer-readable medium, such as a random access 
memory (RAM) 108, in the embodiment shown coupled to a processor 110. The 
processor 110 executes a set of computer-executable program instructions stored in 
memory 108. Such processors may include a microprocessor, an ASIC, and state 
machines. Such processors include, or may be in communication with, media, for 
example, computer-readable media, which stores instructions that, when executed by the 
processor, cause the processor to perform the steps described herein. Embodiments of 
computer-readable media include, but are not limited to, an electronic, optical, magnetic, 
or other storage or transmission device capable of providing a processor, such as the 
processor in communication with a touch-sensitive input device, with computer-readable 
instructions. Other examples of suitable media include, but are not limited to, a floppy 
disk, CD-ROM, magnetic disk, memory chip, ROM, RAM, an ASIC, a configured 
processor, all optical media, all magnetic tape or other magnetic media, or any other 
medium from which a computer processor can read instructions. Also various other 
forms of computer-readable media may transmit or carry instructions to a computer, 
including a router, private or public network, or other transmission device or channel, 
both wired and wireless. The instructions may comprise code from any
computer-programming language, including, for example, C, C++, C#, Visual Basic, Java, and
JavaScript. 

[0016] Client devices 102a-n may also include a number of external or internal 
devices such as a mouse, a CD-ROM, a keyboard, a display, or other input or output 
devices. Examples of client devices 102a-n are personal computers, digital assistants, 
personal digital assistants, cellular phones, mobile phones, smart phones, pagers, digital 
tablets, laptop computers, a processor-based device and similar types of systems and 
devices. In general, a client device 102a-n may be any type of processor-based platform 
connected to a network 106 and that interacts with one or more application programs. 
The client devices 102a-n shown include personal computers executing a browser 
application program such as Internet Explorer™, version 6.0 from Microsoft 
Corporation; Netscape Navigator™, version 7.1 from Netscape Communications 
Corporation; and Safari™, version 1.0 from Apple Computer. 

[0017] Through the client devices 102a-n, users 112a-n can communicate over the 
network 106 with each other and with other systems and devices coupled to the network 
106. Users 112a-n can be located in different locations, countries, or regions. As shown 
in FIG. 1, a server device 104 is also coupled to the network 106. In the embodiment 
shown, a user 112a-n can generate a search query 114 at a client device 102a-n to 
transmit to the server device 104 via the network 106. For example, a user 112a in one 
country types a textual search query 114 into a query field of a web page displayed on the 
client device 102a. The client device 102a then transmits an associated search query 
signal 126 reflecting the search query 114 via the network 106 to the server device 104. 

[0018] The server device 104 shown includes a server executing a search engine
application program such as the Google™ search engine. Similar to the client devices
102a-n, the server device 104 shown includes a processor 116 coupled to a
computer-readable memory 118. Server device 104, depicted as a single computer system, may be
implemented as a network of computer processors. Examples of a server device 104 are 
servers, mainframe computers, networked computers, a processor-based device and 
similar types of systems and devices. Client processors 110 and the server processor 116 
can be any of a number of well-known computer processors, such as processors from 
Intel Corporation of Santa Clara, California; and Motorola Corporation of Schaumburg, 
Illinois. 

[0019] Memory 118 contains the search engine application program, also known as a 
search engine 124. The search engine 124 locates relevant information in response to a 
search query 114 from a user 112a-n. 

[0020] The server device 104, or related device, has previously performed a search of 
the network 106 to locate articles, such as web pages, stored at other devices or systems 
connected to the network 106, and indexed the articles in memory 118 or another data 
storage device. Articles include documents, for example, web pages of various formats,
such as HTML, XML, XHTML, Portable Document Format (PDF) files, and word 
processor, database, and application program document files, audio, video, or any other 
information of any type whatsoever made available on a network (such as the Internet), a 
personal computer, or other computing or storage means. The embodiments described 
herein are described generally in relation to documents, but embodiments may operate on 
any type of article. 

[0021] The search engine 124 responds to the associated search query signal 126
reflecting the search query 114 by returning a set of relevant information or search results 
132 to client device 102a-n from which the search query 114 originated. 

[0022] The search engine 124 shown includes a document locator 134, a ranking 
processor 136, and a population processor 138. In the embodiment shown, each 
comprises computer code residing in the memory 118. The document locator 134 
identifies a set of documents that are responsive to the search query 114 from a user 
112a. In the embodiment shown, this is accomplished by accessing an index of 
documents, indexed in accordance with potential search queries or search terms. The 
ranking processor 136 ranks or scores the search result 132 including the located set of
web pages or documents based upon relevance to a search query 114 and/or any other
criteria. The population processor 138 determines or otherwise measures a population
signal, such as the population signal 128, that reflects or otherwise corresponds to a
population associated with a user 112a-n. Note that other functions and characteristics of
the document locator 134, ranking processor 136, and population processor 138 are 
further described below. 

[0023] Server device 104 also provides access to other storage elements, such as a 
population data storage element, in the example shown a population database 120, and a 
selection data storage element, in the example shown, a selection data database 122. The 
specific selection database shown is a clickthrough database, but any selection data 
storage element may be used. Data storage elements may include any one or 
combination of methods for storing data, including without limitation, arrays, hashtables, 
lists, and pairs. Other similar types of data storage devices can be accessed by the server
device 104. The population database 120 stores population information associated with
users 112a-n inputting search queries. Examples of population information associated
with users 112a-n include information about the locations of users 112a-n, information
about the populations with which users 112a-n are associated, and information about
groups with which users 112a-n are associated.

Express Mail No. EL667552452
Attorney Docket No. GP- 134-01
PATENT

[0024] Examples of locations of users can include, but are not limited to, a continent, 
a region, a country, a state, a county, or a city. By way of example, locations of users can 
be identified by country, such as France, Germany, Japan, and the United States. 

[0025] Examples of populations with which users are associated can include, but are 
not limited to, a gender, a demographic, an ethnicity, a continent, a region, a country, a 
state, a county, or a city. By way of example, populations with which users are
associated can be identified by age ranges of the user, such as "under 18 years old,"
"18-24 years old," "25-34 years old," "35-49 years old," "50-62 years old," and "over 62
years old."

[0026] Examples of groups with which users are associated can include, but are not
limited to, a gender, a demographic group, an ethnic group, persons with a shared
characteristic, persons with a shared interest, and persons grouped by a predetermined
selection. By way of example, groups with which users can be associated can be
identified as "all persons interested in collecting ancient shark teeth," and "all persons not
interested in collecting ancient shark teeth."

[0027] Population information can also include self identification-type data or 
automatic identification-type data. Self identification-type data includes, but is not 
limited to, user registration data, user preference data, and other user selected data. By
way of example, self-identification data is a language preference selection that a user 
inputs into a browser application program. Automatic identification-type data includes, 
but is not limited to, the Internet protocol address of a user's location, default data 
obtained from a user's browser application program, cookies, and other data collected 
from a user's application program when the user's application program interacts with a 
search engine. By way of example, automatic-identification data may comprise the 
domain of a user's network address on the Internet, or may be information stored in a 
"cookie" obtained by or accessed by a user's browser application program. 

[0028] The search engine 124 determines population information or otherwise 
executes a set of instructions to determine population information associated with users 
112a-n, and stores population-type information in the population database 120. 
Alternatively, the population processor 138 determines population information or 
otherwise executes a set of instructions to determine population information associated 
with users 112a-n, and stores population-type information in the population database 
120. 

[0029] It should be noted that the present invention may comprise systems having
different architecture than that which is shown in FIG. 1. For example, in some systems 
according to the present invention, the population database 120 and processor 138 may 
not be part of the search engine 124, and may carry out modification of population data 
or other operations offline. Also, in other embodiments, the population processor 138 
may affect the output of the document locator 134 or other system. The system 100 
shown in FIG. 1 is merely exemplary, and is used to explain the exemplary methods 
shown in FIGS. 2-3.

[0030] In the embodiment shown, the population database 120 contains data gathered 
and stored prior to carrying out the example method of the present invention as shown in 
FIGS. 2-3. Still referring to FIG. 1, the population processor 138 shown records 
population information associated with users 112a-n by obtaining and analyzing the 
location of a user inputting a search query and selecting search results for the search 
query. For example, when a user 112a in Japan types in a first search query such as 
"boating," the population processor 138 determines that the user 112a is transmitting the 
search query from an Internet protocol address located in Japan, and is also using 
Japanese language preferences for the browser application program. Furthermore, when 
the user 112a selects particular search results returned by the search engine in response to 
the search query, the population processor 138 determines that the user 112a selects 
particular documents in the Japanese language. Thus, when a user 112a in Japan inputs 
the query "boating," search results relevant to the user 112a in Japan may be returned 
such as "boating.co.jp." Other types of population information can be determined by the 
invention and stored by the population database 120. 

[0031] By way of another example, the population database 120 can store 
information that a user is associated with sub-populations of a population. For example, 
the population processor 138 determines that a user 112a in Europe, a region in the 
world, is from Luxembourg, a country in Europe. The population database 120 can also 
store information that another user in Europe is from France, another country in Europe. 
Thus, while each user is associated with the population of "Europe," each user is 
associated with a respective sub-population, "Luxembourg" and "France." Populations 
and sub-populations can include, but are not limited to, continents, regions, countries,
states, counties, cities, genders, demographic groups, ethnic groups, languages spoken, 
universal resource locators, internet protocol addresses, domain names, internet service 
providers, groups, persons with shared characteristics, persons with shared interests, and 
persons grouped by a predetermined selection. Various levels of sub-populations can 
exist for a population. For example, "Parisians" are a sub-population of "France," which
is a sub-population of "Europe," which is a sub-population of the "World." Sub-population
information can be useful if there is an insufficient number of user clicks from users from
a particular location or population such as France. In that case, because France is a
sub-population of "Europe," click information corresponding to users in "Europe" could be
used to augment the click information for a query from the user in France. Generally, if
click information for a sub-population is sparse or does not exist, information from a 
higher population level can be used to augment the click information. 
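
The fallback to a higher population level can be sketched as follows. This is a minimal illustration under stated assumptions, not the claimed implementation: the parent map `PARENT`, the threshold `min_clicks`, and the function name `clicks_with_backoff` are hypothetical, and the sketch substitutes the parent's click count rather than blending the two levels.

```python
# Hypothetical hierarchy: each population maps to its parent population.
PARENT = {"Parisians": "France", "France": "Europe", "Europe": "World"}


def clicks_with_backoff(clicks, population, min_clicks=50):
    """Return (population_used, click_count).

    `clicks` maps a population name to its click count for one
    query/document pair. Starting at `population`, climb to parent
    populations while the count is below `min_clicks` (an assumed
    threshold) and a parent exists.
    """
    current = population
    while True:
        count = clicks.get(current, 0)
        parent = PARENT.get(current)
        if count >= min_clicks or parent is None:
            return current, count
        current = parent


# Sparse data for France, so the lookup falls back to Europe.
example_clicks = {"France": 3, "Europe": 120, "World": 900}
level, count = clicks_with_backoff(example_clicks, "Parisians")
```

A system that augments rather than substitutes could instead combine the counts from several levels, for example with per-level weights.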

[0032] In determining population and sub-population information, the population 
processor 138 can also determine a weight for each type of information. For example,
the population processor 138 can weight information that a user is from a particular
region less than information that a user is from a particular country, so that
improved search results for subsequent search queries can be obtained from region and
country information. Thus, information that a user is from a particular region (Europe)
can be weighted less than information that a user is from a particular country (France).
Other types of weighting or similar population-type data can be defined by the invention
and stored by the population database 120.
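
One way to picture such weighting is a weighted average of per-level click rates. The sketch below is only an assumption about how weights might be applied: the weight values in `WEIGHTS` and the function `weighted_click_rate` are hypothetical and are not specified in the text.

```python
# Hypothetical weights: country-level information counts more than
# region-level information.
WEIGHTS = {"country": 0.7, "region": 0.3}


def weighted_click_rate(rates):
    """Combine per-level click rates into one value.

    `rates` maps a level name ("country", "region") to a click rate.
    Weights are normalised over the levels actually present, so a
    missing level does not drag the result toward zero.
    """
    total_weight = sum(WEIGHTS[level] for level in rates)
    if total_weight == 0:
        return 0.0
    weighted_sum = sum(WEIGHTS[level] * rate for level, rate in rates.items())
    return weighted_sum / total_weight


# France (country) rate of 0.5 and Europe (region) rate of 0.1.
combined = weighted_click_rate({"country": 0.5, "region": 0.1})
```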

[0033] The population database 120 shown includes a list of user locations for a 
particular query. For example, for the search query "boating," population information
such as the determined locations of users who input the query "boating" is stored and
associated with the search query "boating." The users' locations can be "France,"
"Japan," and the "United States." These locations are used for example purposes. In
other embodiments, the number of locations can be greater or fewer, or other countries, 
locations, populations, or sub-populations can be used. 



[0034] An example of information stored in a population database implemented by
various embodiments of the invention is as follows:

Query    Locations
Q1       Japan, France, United States
Q2       Europe, Asia, North America



[0035] In the table shown above, the first column lists unique queries and the second
column lists corresponding locations of users. Each query represents a search query input
by a user. The corresponding locations of users represent the determined locations of
users who input the respective query. Thus, for query "Q1" shown above, the determined
locations of users who previously input the query "Q1" include "Japan," "France," and
the "United States." When the table is implemented by the invention, the search engine
124 can call upon the determined locations of users in the "Locations" column for a
particular query such as "Q1." When a new query is input by a user, the new query is
inserted into the "Query" column. Likewise, when a location of a user is determined for
the new query, the new location is inserted into the second column titled "Locations."
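
The query-to-locations table can be sketched as a simple in-memory mapping. This is an illustrative assumption only: the text does not prescribe a data structure, and the function name `record_query_location` is hypothetical.

```python
from collections import defaultdict


def record_query_location(table, query, location):
    """Add `location` to the set of locations recorded for `query`.

    A previously unseen query gets a new row automatically, and a
    location already recorded for the query is not duplicated.
    """
    table[query].add(location)


# Rebuild the two example rows from the table above.
table = defaultdict(set)
for loc in ("Japan", "France", "United States"):
    record_query_location(table, "Q1", loc)
for loc in ("Europe", "Asia", "North America"):
    record_query_location(table, "Q2", loc)
```

A set is used here so that repeated observations of the same location leave the row unchanged; a counter would be the natural choice if frequencies per location were also needed.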

[0036] The clickthrough database 122 shown stores data associated with users'
selection of a search result 132 from a search engine 124, such as from a list of
documents located in response to a search query 114. For example, a user 112a enters an 
input at a client device 102a-n, such as manipulating a mouse or another input device to 
click on one or more URLs associated with web pages or documents of a search result 
132. A user "click" is generated by the user's selection of a document located in the 
search result 132. This "click" on a document is stored in the clickthrough database 122 
as a selection associated with the document's presence in a search result returned for a 
particular search query. Many other such selection-type data, associated with users' 
selections of documents in search results, are stored there as well. 

[0037] User clicks are referred to as "clickthrough" data. In the embodiment shown, 
the search engine 124 measures and stores the clickthrough data as well as other data 
related to each of the documents located in the search result 132. 

[0038] Clickthrough data is generally an indicator of quality in a search result. 
Quality signals or clickthrough data can include, but is not limited to, whether a particular 
URL or document is clicked by a particular user; how often a URL, document, or web 
page is clicked by one or more users; and how often a particular user clicks on specific 
documents or web pages. Other types of quality signals similar to clickthrough data, such 
as user inputs or observational type data, can be stored by a clickthrough database 122 or 
similar data storage devices. 

[0039] Other data related to documents located in a search result 132 that can be 
stored in a clickthrough database 122 or other data storage device can include, but is not 
limited to, how often a particular URL, document, or web page is shown in response to a
search query 114; how many times a particular search query 114 is asked by users 112a-n
from a particular location; how many times a particular search query 114 is asked by
users 112a-n from a particular population; how many times a particular document is
selected by users 112a-n from a particular location; how many times a particular
document is selected by users 112a-n from a particular population; how many times a
particular document is selected by users 112a-n for a particular search query 114; the
age or time a particular document has been posted on a network 106; and the identity of a
source of a particular document on a network 106.

[0040] Population information from the population database 120 and selection data 
from the selection database (shown as a clickthrough database 122) can be processed by 
the population processor 138 and stored for subsequent use. For example, the population 
processor 138 retrieves clickthrough data for a particular search query. The clickthrough 
data for the particular search query is apportioned based on users' locations. The search 
engine 124 calls to the population database 120 for location information for all users 
entering a particular search query and selecting documents for the search result for the 
query. If the population processor 138 determines that users from three locations, Japan, 
France, and the United States, submitted selection data for a particular query 114, a 
respective designation for each set of users from each location can be defined by the 
population processor 138. Thus, users from Japan can be designated as "J," users from 
France can be designated as "F," and users from the United States can be designated as 
"US." The population processor 138 then apportions the number of clicks collected by
the clickthrough database 122 for the particular set of documents to each respective 
designation based on user location.

[0041] One example of information stored in a population database implemented by
an embodiment of the invention is as follows:

Query   Document   Total Number    Japan        France       United States
                   of All Clicks
Q       D_1        101             1            20           80
Q       D_2        207             2            5            200
...
Q       D_i        #(Q,D_i,A)      #(Q,D_i,J)   #(Q,D_i,F)   #(Q,D_i,U)
...
Q       D_N        #(Q,D_N,A)      #(Q,D_N,J)   #(Q,D_N,F)   #(Q,D_N,U)
Q       D_total    1500            100          300          1100

[0042] In the example provided above for the population database 120 shown, for the
search query "Q," the total number of user clicks on document "D_1" was "101." The total
number of user clicks on document "D_1" by users located in Japan was "1," the total
number of user clicks on document "D_1" by users located in France was "20," and the
total number of user clicks on document "D_1" by users located in the United States was
"80."
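
The apportioning of clicks by user location can be sketched as a tally over a click log. This is a minimal sketch under assumptions: the log format of (query, document, location) tuples, the `"ALL"` key for the overall total, and the function name `apportion_clicks` are hypothetical.

```python
from collections import Counter, defaultdict


def apportion_clicks(click_log):
    """Tally clicks per (query, document) pair, split by user location.

    `click_log` is an iterable of (query, document, location) tuples.
    Returns a mapping from (query, document) to a Counter holding one
    count per location plus an "ALL" entry with the overall total.
    """
    table = defaultdict(Counter)
    for query, document, location in click_log:
        table[(query, document)][location] += 1
        table[(query, document)]["ALL"] += 1
    return table


# Reproduce the D_1 row of the example table: 1 + 20 + 80 = 101 clicks.
log = (
    [("Q", "D_1", "Japan")] * 1
    + [("Q", "D_1", "France")] * 20
    + [("Q", "D_1", "United States")] * 80
)
counts = apportion_clicks(log)
```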

[0043] In the embodiment shown in FIG. 1, the server 104 is in communication with
the population database 120 and the clickthrough database 122. The server 104 carries
out a process by which the data in the two databases 120, 122 are used to improve the
search results provided in response to a search query 114 from a user 112a.

[0044] Various methods in accordance with the present invention may be carried out.
One exemplary method according to the present invention comprises receiving a search 
query, determining a population associated with the search query, determining an article 
(such as a web page) associated with the search query, and determining a ranking score 
for the article based at least in part on data associated with the population. This 
population information is thus used to impact the ranking score for the article. 
Preferably, this population information indicates behavior of the population determined 
(e.g., a group or sub-group) relative to the search query and/or the article. For example, 
this information may indicate the preferred articles selected by others in the same, 
similar, or related population in relation to the same, similar, or related query. 

[0045] A ranking score for a second article, a third article, a fourth article, etc., 
associated with the search query may also be determined based at least in part on 
information associated with the population. These articles may then be ranked against 
each other based on the ranking score and presented in a ranked order to the person 
submitting the search query. Preferably, this results in a ranking that provides the most 
relevant articles to the user first. 
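
As a rough illustration of ranking articles using population data, the sketch below boosts a base relevance score by the share of the population's clicks each article received. The multiplicative formula, the `boost` parameter, and the function name `rank_articles` are assumptions made for illustration; this passage does not specify how the ranking score is computed.

```python
def rank_articles(base_scores, population_clicks, boost=1.0):
    """Return article identifiers ordered from highest to lowest score.

    `base_scores` maps each article to a population-independent score.
    `population_clicks` maps articles to the number of clicks they
    received from the population associated with the query. Each
    article's score is its base score scaled by (1 + boost * share),
    where share is the article's fraction of the population's clicks.
    """
    total = sum(population_clicks.values())

    def score(article):
        share = population_clicks.get(article, 0) / total if total else 0.0
        return base_scores[article] * (1.0 + boost * share)

    return sorted(base_scores, key=score, reverse=True)


# With equal base scores, users in France favour D_1 (20 clicks vs. 5),
# while users in the United States favour D_2 (200 clicks vs. 80).
base = {"D_1": 1.0, "D_2": 1.0}
france_order = rank_articles(base, {"D_1": 20, "D_2": 5})
us_order = rank_articles(base, {"D_1": 80, "D_2": 200})
```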

[0046] The population associated with the query can be one or more of a variety of 
populations. Examples include, but are not limited to demographic data such as age, age 
range, sex, race, primary language, secondary language, location, income, income range, 
a continent, a region, a country, a state, a county, a city, a gender, an ethnic group, a 
group, persons with a shared characteristic, persons with a shared interest, persons 
grouped by a predetermined selection, and internet service provider data (or the likely or 
possible data for any of these). In other words, the population may be any group 
determined by any characteristics selected.

[0047] The population associated with the search query may be determined in one or 
more of a variety of ways. For example, demographic data associated with a sender of 
the search query may be determined in order to determine the population of interest, and 
this data can be any one or more of the above-mentioned populations or others. For 
example, a likely geographic location for the sender of the search query may be 
determined by identifying the Internet Protocol address from which the search query was 
sent, an address input by the sender to access a search engine, or demographic data input 
by the sender. The population associated with the search query may also be determined 
in other ways, such as determining demographic data associated with the search query. 
This may be accomplished by, for example, determining the language of the search query 
or determining data associated with previous senders of the search query. 
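
The signals listed above could be combined in a simple priority order, as in the sketch below. Everything here is an assumption for illustration: the lookup tables, the priority order, and the function name `infer_population` are hypothetical, and the country for the IP address is taken as an already-resolved input because the text does not describe how that resolution is performed. Mapping a language to a single country is also a simplification.

```python
# Hypothetical lookup tables; a real system would need far richer data.
SUFFIX_TO_COUNTRY = {"co.jp": "Japan", "fr": "France", "de": "Germany"}
LANGUAGE_TO_COUNTRY = {"ja": "Japan", "fr": "France", "de": "Germany"}


def infer_population(ip_country=None, host=None, browser_language=None):
    """Guess a likely country-level population for a query sender.

    Signals are tried in an assumed priority order: the country already
    resolved from the sender's IP address, then the domain suffix of
    the search engine address the sender used, then the browser
    language preference. Returns None when no signal is usable.
    """
    if ip_country:
        return ip_country
    if host:
        for suffix, country in SUFFIX_TO_COUNTRY.items():
            if host.endswith("." + suffix):
                return country
    if browser_language:
        primary = browser_language.split("-")[0].lower()
        return LANGUAGE_TO_COUNTRY.get(primary)
    return None
```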

[0048] As can be seen, determining a population associated with the search query can
comprise determining self-identification data, automatic-identification data, or other data 
or information associated with a user transmitting the search query, such as user 
registration data, user preference data, and user selected data. For example, in registering 
for membership or access to a web site, a user may input registration information. The 
user may express preferences during such a registration process (e.g., preferred language) 
or may express preferences in other ways (such as in selection of web pages or domain 
use). Automatic-identification data may comprise, for example, an IP address, a domain,
default data obtained from an application associated with the user, or other
automatically-procured or automatically-provided information. There can be, of course,
some overlap between self-identification data and automatic-identification data.

[0049] Data associated with the population determined may itself be determined in
one or more of a variety of ways. For example, a selection score for the article in a 
context of the population may be determined. As an illustration, it may be determined 
that a certain number of members of the population associated with the search query at 
hand had previously clicked on the article at issue. This selection score may indicate the 
relative interest of members of the population in the article at issue. 
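
A minimal sketch of such a selection score follows, assuming that prior selections are available as a simple log of (query, document, population) records. The record layout and the names used are hypothetical and serve only to illustrate counting clicks within a population.

```python
from collections import Counter
from typing import Iterable, NamedTuple


class Click(NamedTuple):
    """One prior selection of a search result (hypothetical record layout)."""
    query: str
    document: str
    population: str


def selection_scores(clicks: Iterable[Click], query: str, population: str) -> Counter:
    """Count clicks per document for `query` made by members of `population`."""
    return Counter(
        click.document
        for click in clicks
        if click.query == query and click.population == population
    )
```

A document that was never selected by the population simply has a count of zero in the returned counter.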

[0050] Again, there are a wide variety of data that can comprise the population 
information. Other examples include a number of members of the population, a number 
of members of the population that selected a result returned for the search query 
previously, a number of members of the population that input the search query, a number 
of members of the population to which search results for the search query were shown, a 
total selection score, and a total number of members of the population that selected the 
article. There are many other examples. 

[0051] In some instances, more than one population associated with the search query
may be determined and used in order to provide improved search results. In one
embodiment, for example, a second population associated with the search query is
determined, and the ranking score for the article is further based at least in part
on data associated with the second population.

[0052] These and other aspects of embodiments of the present invention are 
described further herein. These exemplary aspects of embodiments of the present 
invention may be repeated or iterated to improve search results. Moreover, these and 
other steps taken in methods according to the present invention may be stored in the form
of program code in a computer-readable medium, such as memory associated with a
processor, a disk, or other computer-readable medium. 

[0053] FIGS. 2-3 illustrate an exemplary method 200 in accordance with the present 
invention in detail. This exemplary method is provided by way of example, as there are a
variety of ways to carry out methods according to the present invention. The method 200 
shown in FIG. 2 can be executed or otherwise performed by any of various systems. The 
method 200 is described below as carried out by the system 100 shown in FIG. 1 by way 
of example, and various elements of the system 100 are referenced in explaining the 
example method of FIGs. 2-3. The method 200 shown provides an improvement of a 
search ranking using population information. 

[0054] Each block shown in FIGs. 2-3 represents one or more steps carried out in the 
exemplary method 200. Referring to FIG. 2, in block 202, the example method 200 
begins. Block 202 is followed by block 204, in which a population database 120 is 
provided. This may be accomplished by, for example, constructing such a database or 
establishing communication with such a database. As described with reference to FIG. 1, 
the population database 120 stores population-type information for documents selected in 
a search result for a search query 114 and other search queries. 

[0055] Block 204 is followed by block 206, in which a selection database, in this case 
a clickthrough database 122, is provided. This may be accomplished by, for example, 
constructing such a database or establishing communication with such a database. As 
described with reference to FIG. 1, the clickthrough database 122 stores data associated 
with users' clicks or inputs to a search result 132 provided by the search engine 124, such
as a list of documents, such as web pages, provided in response to a search query 114
from a user 112a. 

[0056] Block 206 is followed by block 208, in which a search query, in the form of a
search query signal, is received by the server. In the embodiment shown, a user 112a
generates a search query 114 at a client device 102a. The client device 102a transmits an 
associated search query signal 126 reflecting the search query 114 to the server device 
104 via a network 106. The search engine 124 receives the search query signal 126 and 
processes the search query 114. For example, if the user 112a types a search query 
"boating" into the search or query field of a search page on a browser application 
program, the client 102a transmits a search query signal 126 that includes the text 
"boating" or some other representation or indication of "boating." The search engine 124 
receives the signal 126 and determines that "boating" is the desired search query 114. 

[0057] Block 208 is followed by block 210, in which article data, in the case shown, 
document data, is determined and received. In this block 210 in the embodiment shown, 
the search engine 124 conducts a search for relevant documents in a search database (not 
shown) or memory 118 that have previously been indexed from the network 106. The 
search engine 124 receives document data from the search database or memory 118 in 
response to the search query signal 126 reflecting the search query 114 from the user 
112a. The document data is also referred to as the initial search result for the search 
query 114. Document data can include, but is not limited to, a universal resource locator 
(URL) that provides a link to a document, web page, or to a location from which a 
document or web page can be retrieved or otherwise accessed by the user 112a via the 
network 106. Note that document data is sometimes referred to as a "document"
throughout the text of the specification. Alternatively, the document locator 134 obtains
or otherwise receives document data in response to a search query signal 126 reflecting a 
search query 114. 

[0058] For example, in block 210 shown, the search engine 124 shown would
determine a list of documents responsive to the search query "boating." This list of 
documents would comprise the determined document data. An initial search result list 
for "boating" could comprise a list of 15 documents. In the embodiment shown, this 
initial determination of document data may be by means of a conventional search engine 
query and results return. 

[0059] Block 210 is followed by block 212, in which a population signal is 
determined for each article of interest. In the embodiment shown, the search engine 124 
generates a population signal 128 for each document of the initial search result list 
determined in block 210 using a population function. For example, a population signal 
may be determined for each of the 15 documents identified in the initial search result 
determined in block 210. 

[0060] The population signal indicates a rating or score for the article of interest, and 
this rating or score reflects the relative interest of those in the same population group as 
the searching user. For example, articles previously selected by users in the same 
population group as the querying user 112a when carrying out the same query "Q" as 
input by the user 112a may receive a higher score for a population signal than articles 
previously selected only by users in population groups of which the user is not a member. 
This is but one example, however, and many variables and permutations may be used.
This rating or score reflected in the population signal may be used alone or in
combination with other scoring or rating signals to score or rank the document, and to 
rank and compare groups of documents to, for example, provide a search result for the 
query sent by the user. 
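
One possible way of combining the population signal with another scoring signal is sketched below. The linear blend and the "weight" parameter are assumptions chosen for illustration; the specification does not prescribe how the signals are combined.

```python
from typing import Dict, List


def rerank(base_scores: Dict[str, float],
           population_signals: Dict[str, float],
           weight: float = 0.5) -> List[str]:
    """Order documents by a linear blend of base score and population signal.

    `weight` (hypothetical) is the share given to the population signal;
    a weight of 0 leaves the conventional ordering unchanged.
    """
    combined = {
        doc: (1.0 - weight) * base + weight * population_signals.get(doc, 0.0)
        for doc, base in base_scores.items()
    }
    # Highest combined score first; ties are broken by document identifier.
    return sorted(combined, key=lambda doc: (-combined[doc], doc))
```

Documents without a population signal are treated here as having a signal of zero, which is itself a design choice rather than a requirement.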

[0061] In the embodiment shown, the population signal is determined by a population
signal function. The population signal function may comprise an algorithm for
calculating the population signal based on one or more variables. The population signal 
function in the embodiment shown comprises a set of instructions processed by the 
population processor 138. The algorithm is stored in memory 118. 

[0062] Any one or more of a variety of population signal functions may be 
implemented by various embodiments of the invention. Examples of variables that may 
be included in a population signal function include, without limitation, one or more of the 
following: 

• a total selection score for an article "d" for query "q" (e.g., the total number
of clicks by all users on document "d" returned in search results in the context 
of query "q" or the total number of members of a population that selected a 
document "d"); 

• a selected or calculated weight of relationship between the selection score for
article "d" for a query "q" (e.g., the total number of clicks by all users in
population "pop" on the document "d" returned in search results in the context
of search query "q");

• a selection score for document "d" for query "q" in the context of members of
a population "pop" (e.g., a number of clicks for a document by members of
the population when the document is returned for a search for query "q");

• a smoothing factor that reflects how much data is needed to trust a click signal 
(e.g., a factor that reflects reliability or trust in the number of clicks by all 
users on document "d" returned in search results for query "q"); 

• a selection score for document "d" (e.g. number of clicks on document "d") 
for query "q" in the context of all users regardless of a population; 

• a total selection score for a set of documents (e.g., the number of clicks on all 
documents returned for query "q") by all users regardless of population; 

• a selection score for a set of documents (e.g., the number of clicks on all 
documents returned for query "q") by all users in a population "pop"; 

• a total selection score for document "d" (e.g. number of clicks on document 
"d") by all users in a population "pop" for any query "q"; 

• a selection score for document "d" (e.g. number of clicks on document "d") by
all users regardless of population for any query "qi";

• a number of times a query "q" was input by users in a population "pop"; 

• a number of times a query "q" was input by all users regardless of population; 

• a number of members of a population (e.g., a number of members of a 
population that input a particular search query, selected a result returned for a 
particular search query, or were shown search results for a particular search 
query);

• one or more other ranking factors or scores, based on population, the article
under consideration, and/or other factors.

[0063] There are a variety of other variables that may be included, and these are only 
examples. Moreover, these and other variables may be limited or defined by designated 
time period, a designated number of users, the number of users who are self-identified or 
automatically identified in a population, by those who input a query "Q," or by other 
limitations or refinements. Variables, limitations, definitions, or other data associated 
with population data are generally referred to as population information or population 
data. 
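
As an illustration of limiting a variable by a designated time period, the following sketch counts only the clicks whose timestamps fall within a trailing window. The function name and the use of timestamps as the stored form of click data are assumptions made for this sketch.

```python
from datetime import datetime, timedelta
from typing import Iterable


def clicks_in_period(click_times: Iterable[datetime],
                     end: datetime,
                     period: timedelta) -> int:
    """Count clicks that occurred within `period` up to and including `end`."""
    start = end - period
    return sum(1 for t in click_times if start <= t <= end)
```

The same pattern could restrict any of the counts listed above, for example clicks by members of a population for a query.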

[0064] An example of a population signal function is as follows: 



(1)  S(q, d_j) = \frac{\#(q, d_j, P) + \mu \cdot \dfrac{\#(q, d_j)}{\sum_{i=1}^{N} \#(q, d_i)}}{\sum_{i=1}^{N} \#(q, d_i, P) + \mu}



[0065] wherein "S(q, dj)" is a score calculated for document "j" for a search query
"q," based upon the population information and clickthrough data for users of a particular
population "P;"

[0066] "#(q, dj, P)" is the number of times document "j" was clicked by users in
population "P" for search query "q;"

[0067] "μ" is a smoothing factor that reflects how much data is needed to trust a click
signal such as the number of clicks by users of population "P" for a query "q;"

[0068] "#(q, dj)" is the number of times document "j" was clicked by all users
regardless of population for query "q;"

[0069] "#(q, di)" is the total number of user clicks on document "i" for query "q;" and

[0070] "#(q, di, P)" is the total number of user clicks for all users of population "P"
for document "i" for query "q."
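
A minimal sketch of example (1) follows. It reads the function, as an interpretation of the definitions in [0065] to [0070], as the population click count for the document plus the smoothing factor times the document's share of all-user clicks, divided by the total population clicks over the N documents plus the smoothing factor. The function and parameter names are hypothetical, and returning zero when no data exists is an assumption of the sketch.

```python
from typing import Mapping


def population_signal(doc: str,
                      pop_clicks: Mapping[str, int],
                      all_clicks: Mapping[str, int],
                      mu: float) -> float:
    """Score `doc` for one query under the reading of example (1) given above.

    pop_clicks[d] plays the role of #(q, d, P); all_clicks[d] of #(q, d).
    Both mappings are assumed to cover the N documents of the search result.
    """
    total_pop = sum(pop_clicks.values())   # sum over i of #(q, d_i, P)
    total_all = sum(all_clicks.values())   # sum over i of #(q, d_i)
    # Share of all-user clicks received by this document (0 if no clicks).
    prior = all_clicks.get(doc, 0) / total_all if total_all else 0.0
    denominator = total_pop + mu
    if denominator == 0:
        return 0.0  # no population data and no smoothing (assumed fallback)
    return (pop_clicks.get(doc, 0) + mu * prior) / denominator
```

With a smoothing factor of zero the score is the population's own click share; as the smoothing factor grows, the score moves toward the all-user click share.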

[0071] Another example of a population signal is as follows: 



(2) 



S(q,dj) = 




#(q,dj,P) + H 



i = l 



[0072] wherein "SH(q,di)" is the number of times document "i" was shown for query
"q;"

[0073] "SH(q,di,P)" is the number of times document "i" was shown for query "q"
for population "P;" and

[0074] the other variables are described with respect to example (1).

[0075] Another example of a population signal is as follows:

#(q t dj,P) + pa



(3) 



S(q,dj) = 



#(q,dj) 



N 
7 = 7 



TV 



[0076] wherein "μ1" is a smoothing factor that reflects how much data is needed to
trust a click signal such as the number of clicks by users of population "P" for a query
"q;"

[0077] "μ2" is a smoothing factor that reflects how much data is needed to trust a
click signal such as the number of clicks by all users for a query "q;" and

[0078] the other variables are described with respect to examples (1) and/or (2).

[0079] For purposes of illustration, the algorithm from example (1) is embodied in 
the example method according to the present invention shown in FIGs. 2-3. Other 
algorithms besides the examples shown in (1), (2), and (3) may be used in accordance 
with the present invention, and algorithm (1) is provided to illustrate examples. Such 
other algorithms may contain some, all, or none of the variables shown in examples (1), 
(2), and (3). 

[0080] FIG. 3 illustrates an example of a subroutine 212 for carrying out the method 
200 shown in FIG. 2 in accordance with example (1). The subroutine 212 shown 
provides a population signal 128 for each document received in an initial search result 
132. In other embodiments, the number of documents so analyzed may be limited to fewer
than all documents received. An example of subroutine 212 is as follows.

[0081] Referring to FIG. 3, the example subroutine 212 begins at block 300. At
block 300, a counter associated with the search engine 124 is set to a value such as "1." 
For example, the population processor 138 can set a variable "i" in an associated memory 
118 to an initial value of "1." The counter or variable "i" counts the number of 
documents that are processed by the subroutine 212, and the current value of "i" reflects 
which document in the list of documents in the document data is under operation. 

[0082] Block 300 is followed by block 302, in which another counter associated with
the search engine 124 is set to a value such as "1." For example, the population
processor 138 can set a variable "j" in an associated memory 118 to an initial value of
"1." The counter or variable "j" counts the number of documents that are processed by
the subroutine 212, and the current value of "j" reflects which document in the list of 
documents in the document data is under operation. 

[0083] Block 302 is followed by block 304, in which a population is determined for 
querying user 112a. The search engine 124 determines a population, in this case "P." 
For example, the search engine 124 determines a population associated with the user 
inputting the search query "Q." As previously described, self-identification data or 
automatic-identification data, or a combination of both, can be utilized by the search 
engine 124 to determine a population associated with the user. In this embodiment, the 
search engine 124 determines from a user's Internet Protocol address that the user is 
likely from France, designated as population "F;" therefore the search engine 124
cross-references population information of users from France that have previously selected
document "Di" for the query "Q." Upon determining a population "P" for the querying
user, the search engine 124 can call to the population database 120 to obtain any
corresponding population information as needed. For example, the search engine 124 can
retrieve data from the population database 120 and determine that a total of "20" users in 
France, population "F," have previously selected a particular document "Di" for the 
search query "Q." Therefore, population information for a particular population "P" can 
be applied by the search engine 124 to calculate a population signal for document "Di" in 
accordance with the present invention. 

[0084] Block 304 is followed by block 306, in which a number of documents for 
analysis is determined. In block 210, the server 104 received document data associated 
with the search query 114. Among the data determined was the total number of 
documents in the list of documents responsive to the search query 114. 

[0085] This number of documents is characterized by (and is set as) the variable "N."
For example, as mentioned earlier, a search result for the search query "boating" could 
have 15 documents, and the server 104 would set "N" to a value of "15." 

[0086] Note that in alternative embodiments, any total number of documents for a 
search query that has been defined or otherwise stored by the population database 120 or 
another data storage device for a particular query can be transmitted to, or otherwise 
determined by the search engine 124 or population processor 138. Further note that the 
number of documents for each search result for a particular search query can depend 
upon the population-type information previously stored in the population database 120 as 
well as clickthrough data stored in the clickthrough database 122, or on other similar 
types of data stored in other data storage devices. 

[0087] Block 306 is followed by block 308, in which a variable "M" is determined.
The variable "M" reflects the number of documents for which a population signal is
determined. In most instances, "M" will have the same value as the variable "N," or the
number of documents determined in block 306, but it may be different. For example, as
mentioned earlier, a search result for the search query "boating" could have 15
documents, wherein the server set the variable N = 15, and the server 104 would also set
"M" to a value of "15."

[0088] Block 308 is followed by block 310, in which a smoothing factor is 
determined. The search engine 124 determines a smoothing factor that reflects how 
much data is needed to trust a click signal such as user clicks from users from a particular 
population for a particular query. For example, the population processor 138 utilizes a 
predetermined equation or set of computer-executable instructions to determine the 
smoothing factor that accounts for how much data is needed to trust a click signal or the 
quality of the user clicks from all users and from users from a particular population for a 
particular query. The smoothing factor can be particularly useful if there are very few 
user clicks from users of a particular population or if user clicks from a particular source
are known or otherwise perceived not to be reliable or otherwise trustworthy. In that case,
the smoothing factor can be set to a constant value or a value that can otherwise influence 
the weight or value of the data associated with a particular population. In most instances, 
the smoothing factor is applied to a population signal function or to a set of computer- 
executable instructions processed by the population processor 138. 

[0089] As applied to an example, for the query "boating," the population processor
138 determines a smoothing factor if there is an insufficient number of user clicks from
users in France. Thus for the example in the table above, if query "Q" is "boating" and
20 user clicks from users in France is not a sufficient number of clicks to rely upon, then
a smoothing factor is determined. The smoothing factor is represented by "μ" in the
population signal function above in subroutine 212. This factor indicates the reliability or 
perceived trust in the number of clicks by users in France to the search query "boating." 

[0090] As applied to another example, for the query "cricket" by a user in France, the 
population processor 138 determines a smoothing factor if there is an insufficient number 
of user clicks from users in France. However, since France is a sub-population of 
"Europe," the population processor 138 can use click information corresponding to users 
in "Europe" to augment the click information for the query "cricket" from the user in 
France. Generally, if click information for a sub-population is sparse or does not exist, 
information from a higher population level can be used to augment the click information. 
In some instances, there may be additional levels of populations and sub-populations that 
could be used in this manner, i.e., "Parisians" are a sub-population to "France" which is a 
sub-population to "Europe" which is a sub-population to the "World." 
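
The use of higher population levels can be sketched as a walk up the hierarchy. The version below is a simplification that selects the first level having enough click data, rather than blending the levels as the word "augment" may suggest; the parent table, the threshold parameter, and the function name are hypothetical.

```python
from typing import Mapping, Optional

# Hypothetical hierarchy taken from the example in the text.
PARENT = {"Parisians": "France", "France": "Europe", "Europe": "World"}


def population_with_enough_clicks(population: str,
                                  clicks_by_population: Mapping[str, int],
                                  min_clicks: int) -> Optional[str]:
    """Return the most specific population level with at least `min_clicks`.

    Starts at `population` and moves to successively broader levels.
    Returns None if even the broadest level has too little data.
    """
    current: Optional[str] = population
    while current is not None:
        if clicks_by_population.get(current, 0) >= min_clicks:
            return current
        current = PARENT.get(current)  # None once the top level is passed
    return None
```

An embodiment that truly augments sparse data could instead combine the counts found at each level, for example through the smoothing factor.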

[0091] Note that in some instances, a general smoothing factor can be determined. 
The search engine 124 determines a general smoothing factor that reflects the reliability 
or trust in user clicks by all users for a particular document for a particular query. In the 
example shown, an assumption is made that the reliability or trust in user clicks from the 
local population is the same as the reliability or trust of user clicks from the general 
population, and the smoothing factor as determined above can be used as the general 
smoothing factor. 

[0092] By way of further example, a general smoothing factor can be determined as
follows. The population processor 138 accesses the population database 120 or other
data storage device to retrieve user click data. Using a predetermined equation or set of 
computer-executable instructions, the population processor 138 determines the number of 
user clicks by all users for all documents in a search result for a particular query. The 
general smoothing factor can then be set as the number of clicks or selections by all users 
for all documents in a search result for a particular query. 

[0093] If smoothing factors, values, or scores for a particular document for a 
particular query have previously been stored in the population database 120, the 
population processor 138 retrieves the smoothing factors, values, or scores for a 
particular query. For example, the population database 120 may indicate that there is an 
insufficient number of clicks from all users for a document in a search result for the 
search query "boating," such as in the table above where only one user from Japan 
selected a document "Di" for the query "Q." In this instance, the total number of clicks 
from all users for all documents in the search result for the search query "boating" should 
be used. A general population smoothing factor as determined above can then be used in 
an algorithm to gradually transition between the two results. A determination can be 
made whether to use the number of clicks from all users for a document in a search result 
for the search query "boating," or the total number of clicks from all users for all 
documents in the search result for the search query "boating" so that improved search 
results can be obtained from population information and selection data. 

[0094] Block 310 is followed by block 312, in which a number of selections, in this 
case clicks, by a particular population "P" is determined for the current document of 
interest (document "i") for a particular query ("Q"). The search engine 124 determines
the number of clicks for document "i" when document "i" is returned in search results for
query "Q." For example, the population processor 138 accesses population information 
stored by the population database 120 or other data storage devices as well as 
clickthrough data stored by the clickthrough database 122 or previously shared with the 
population database 120. The population processor 138 applies a predetermined equation 
or set of computer-executable instructions to some or all of the population information 
and clickthrough data to determine the number of clicks by a population on a document 
for a search query, also referred to as "#(q,dj,P)." For example, if the query "Q" is
"boating," then it can be determined that there were 20 total user clicks by users in
France on document "Di" for query "Q."

[0095] Block 312 is followed by block 314, in which a number of selections, in this 
case clicks, by all users is determined for the current document of interest (document "i") 
for a particular query ("Q"). The search engine 124 determines the number of clicks by 
all users for document "i" when document "i" is returned in search results for query "Q." 
For example, the population processor 138 accesses population information stored by the 
population database 120 or other data storage devices as well as clickthrough data stored 
by the clickthrough database 122 or previously shared with the population database 120. 
The population processor 138 applies a predetermined equation or set of computer- 
executable instructions to some or all of the population information and clickthrough data 
to determine the number of clicks by all users regardless of population on a document for 
a search query, also referred to as "#(q,dj)." For example, if the query "Q" is "boating," 
then it can be determined that there were 101 total user clicks by all users regardless of 
population on document "Di" for query "Q."

[0096] Block 314 is followed by decision block 316, in which a decision is made
whether all documents for a search query have been processed. The search engine 124 
compares the counter or variable "i" initially set at a value of "1" in block 300 to the 
variable "N," which has been set to a value in block 306 according to the number of 
documents to be processed for the search query "Q." If not all the documents have been 
processed, then the counter or variable "i" will not equal the variable "N" and the "NO" 
branch is followed to block 318. In alternative embodiments, a maximum number of 
documents for analysis may be set. For example, "N" may be set to a maximum number 
that is less than the number of documents determined in block 306. 

[0097] In block 318, a counter is incremented to track the number of documents that 
have been processed. For example, the counter or variable "i" initially set at a value of
"1" is incremented to a next value such as "2." The subroutine 212 then returns to block
312 to continue processing the next document. Subsequent documents are processed by 
blocks 312-314, and the counter or variable "i" at block 318 is subsequently incremented 
until all of the documents are processed, and the value of the counter or variable "i" 
equals "N." Thus, in the example provided previously for "boating," all 15 documents of 
the search result for the search query "boating" would be processed by blocks 312-314. 

[0098] When all of the documents have been processed, the "YES" branch is
followed from decision block 316, and the subroutine 212 continues at block 320.

[0099] At block 320, a sum of the number of selections, in this case clicks, by a 
particular population "P" for all documents (documents "i") for a particular query ("Q"),
and a sum of the number of selections, in this case clicks, by all users for all documents
(documents "i") for a particular query ("Q") are determined. In the embodiment shown,
the search engine 124 determines a sum of the number of selections, in this case clicks, 
by a particular population "P" for all documents (documents "i") for a particular query 
("Q") that describes the total number of clicks by the particular population "P" on 
documents associated with a search query "Q." For example, the population processor 
138 determines a sum which reflects some or all user clicks from a particular population 
"F" on documents returned in response to a prior search for a search query "Q." The sum 
can then be applied by the search engine 124 or population processor 138 to a population 
signal function or to a set of computer-executable instructions. Note that the number of
selections determined in block 312 is summed by block 320 for all documents "i." 

[0100] Furthermore, the search engine 124 determines a sum of the number of 
selections, in this case clicks, by all users for all documents (documents "i") for a 
particular query ("Q") that describes the total number of clicks by all users regardless of
population on documents associated with a search query "Q." For example, the
population processor 138 determines a sum which reflects some or all user clicks from all
users regardless of population on documents returned in response to a prior search for a
search query "Q." The sum can then be applied by the search engine 124 or population
processor 138 to a population signal function or to a set of computer-executable
instructions. Note that the number of selections determined in block 314 is also summed 
by block 320 for all documents "i." 
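
The loop of blocks 312 through 318 and the summation of block 320 might be sketched as follows, assuming that the click counts for the query are available as simple mappings from document to count. The function name and data layout are hypothetical.

```python
from typing import List, Mapping, Sequence, Tuple


def gather_counts(documents: Sequence[str],
                  pop_clicks: Mapping[str, int],
                  all_clicks: Mapping[str, int]
                  ) -> Tuple[List[int], List[int], int, int]:
    """Collect per-document click counts, then total them.

    Returns (clicks by population per document, clicks by all users per
    document, population total, all-user total) for one search query.
    """
    per_doc_pop: List[int] = []
    per_doc_all: List[int] = []
    for doc in documents:                            # blocks 316 and 318: next document
        per_doc_pop.append(pop_clicks.get(doc, 0))   # block 312
        per_doc_all.append(all_clicks.get(doc, 0))   # block 314
    # Block 320: sums over all processed documents.
    return per_doc_pop, per_doc_all, sum(per_doc_pop), sum(per_doc_all)
```

Documents with no recorded clicks contribute zero to both sums in this sketch.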

[0101] Block 320 is followed by block 322, in which a smoothing factor is 
determined. The search engine 124 determines a smoothing factor that reflects how 
much data is needed to trust a click signal such as user clicks from users from a particular
population for a particular query. For example, the population processor 138 utilizes a
predetermined equation or set of computer-executable instructions to determine the 
smoothing factor that accounts for how much data is needed to trust a click signal such as 
the user clicks from all users and from users from a particular population for a particular 
query. The smoothing factor can be particularly useful if there are very few user clicks 
from users of a particular population or if user clicks from a particular source are known or
otherwise perceived not to be reliable or otherwise trustworthy. For example, as applied
to the example for the query "Q" in the table above, since there is only a single click from
users in Japan on document "Di" for the query "Q," the smoothing factor "μ" can be set
to a constant value such as "10." In this case, the smoothing factor can be used to affect 
or otherwise influence the weight or value of the data associated with a particular 
population. In most instances, the smoothing factor is applied to a population signal 
function or to a set of computer-executable instructions processed by the population 
processor 138. 
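
By way of illustration only, the effect of a constant smoothing factor can be sketched as follows. The equation used by the population processor 138 is not specified in this passage, so the additive form n / (n + s) below is an assumption, and the name click_trust is hypothetical. The sketch shows that a single click receives little weight when the smoothing factor is "10," while a large number of clicks receives nearly full weight.

```python
def click_trust(observed_clicks: int, smoothing_factor: float = 10.0) -> float:
    """Return a weight in [0, 1) that rises as more clicks are observed.

    Assumed form: n / (n + s). This is an illustrative choice, not the
    equation of the described embodiment.
    """
    if observed_clicks < 0:
        raise ValueError("click counts cannot be negative")
    if smoothing_factor <= 0:
        raise ValueError("the smoothing factor must be positive")
    return observed_clicks / (observed_clicks + smoothing_factor)


# A single click from one population, with the constant "10" from the text.
one_click = click_trust(1)        # 1 / (1 + 10)
# A hypothetical well-sampled case, for contrast.
many_clicks = click_trust(1000)   # 1000 / (1000 + 10)
```

Under this assumed form, a larger smoothing factor requires more click data before the population's clicks are trusted.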

[0102] Note that in some instances, a general smoothing factor can be determined. 
The search engine 124 determines a general smoothing factor that reflects how much data 
is needed to trust a click signal such as user clicks by all users for a particular document 
for a particular query. In the example shown, an assumption is made that the amount of 
data needed to trust user clicks from the local population is the same as the amount of 
data needed to trust user clicks from the general population, and the smoothing factor as 
determined above can be used as the general smoothing factor. 

[0103] By way of further example, a general smoothing factor can be determined as 
follows. The population processor 138 accesses the population database 120 or other 
data storage device to retrieve user click data. Using a predetermined equation or set of 
computer-executable instructions, the population processor 138 determines the number of 
user clicks by all users for a particular document for a particular query. 

[0104] If smoothing factors, values, or scores for a particular document for a 
particular query have previously been stored in the population database 120, the 
population processor 138 retrieves the smoothing factors, values, or scores for a 
particular query. For example, the population database 120 may indicate that there is an 
insufficient number of clicks from all users for a document in a search result for the 
search query "boating," and that the total number of clicks from all users for all 
documents in the search result for the search query "boating" should be used. A general 
population smoothing factor can gradually transition between the two results. A 
determination can be made whether to use the number of clicks from all users for a 
document in a search result for the search query "boating," or the total number of clicks 
from all users for all documents in the search result for the search query "boating" so that 
improved search results can be obtained from population information and selection data. 

[0105] Block 322 is followed by block 324, in which a number of selections, in this 
case clicks, by all users is determined for each document (document "j") returned in a 
search result for a particular query ("Q"). The search engine 124 determines the number of clicks for 
document "j" regardless of population when document "j" is returned in search results for 
query "Q." For example, the population processor 138 accesses population information 
stored by the population database 120 or other data storage devices as well as 
clickthrough data stored by the clickthrough database 122 or previously shared with the 
population database 120. The population processor 138 applies a predetermined equation 
or set of computer-executable instructions to some or all of the population information 
and clickthrough data to determine the number of clicks by all users on a document 
for a search query, also referred to as "#(q,dj)." For example, if the query "Q" is 
"boating," then it can be determined that there were 101 total user clicks by all users on 
document "D1" for query "Q." 

[0106] Block 324 is followed by block 326, in which a number of selections, in this 
case clicks, by all users in a population "P" is determined for each document (document 
"j") returned in a search result for a particular query ("Q"). The search engine 124 
determines the number of clicks by all users in a population "P" for document "j" when 
document "j" is returned in search results for query "Q." For example, the population 
processor 138 accesses population information stored by the population database 120 or 
other data storage devices as well as clickthrough data stored by the clickthrough 
database 122 or previously shared with the population database 120. The population 
processor 138 applies a predetermined equation or set of computer-executable 
instructions to some or all of the population information and clickthrough data to 
determine the number of clicks by a population on a document for a search query, also 
referred to as "#(q,dj,P)." For example, if the query "Q" is "boating," then it can be 
determined that there were 20 total user clicks by all users in the population of France on 
document "D1" for query "Q." 
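
By way of illustration only, the two counts determined in blocks 324 and 326 can be sketched as a single pass over a click log. The log layout, the Click record, and the function count_clicks are hypothetical and are not taken from the population database 120 or the clickthrough database 122. The sample data reproduces the figures given above for "boating" (101 clicks by all users and 20 clicks from France); the label "elsewhere" for the remaining 81 clicks is a placeholder.

```python
from collections import Counter
from typing import Iterable, NamedTuple, Tuple


class Click(NamedTuple):
    """One user selection; a hypothetical record layout."""
    query: str
    document: str
    population: str


def count_clicks(log: Iterable[Click], population: str) -> Tuple[Counter, Counter]:
    """Return (clicks by all users, clicks by the given population),
    each keyed by (query, document)."""
    all_users: Counter = Counter()
    in_population: Counter = Counter()
    for click in log:
        key = (click.query, click.document)
        all_users[key] += 1                  # counted regardless of population
        if click.population == population:
            in_population[key] += 1          # counted only for population "P"
    return all_users, in_population


# 20 clicks from France plus 81 placeholder clicks from other populations.
log = [Click("boating", "D1", "France")] * 20 + [Click("boating", "D1", "elsewhere")] * 81
all_users, france = count_clicks(log, "France")
clicks_all = all_users[("boating", "D1")]
clicks_france = france[("boating", "D1")]
```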

[0107] Block 326 is followed by block 328, in which a population signal for a 
document for a particular search query is determined. The search engine 124 determines 
a population signal 128 for a particular document in a search result 132. For example, the 
population processor 138 uses a number of factors such as the number of times a search 
query was asked by users of a particular population; the number of times a search query 
was asked by all users; the number of times a search query was asked and a particular 
document was clicked by users of a particular population; the number of times a search 
query was asked and a particular document was clicked by users of all populations; the 
number of times a document was clicked by users of a particular population for any 
search query; the number of times a document was clicked by all users for any search 
query; the smoothing factor, if needed; and the population click weight, if needed, to 
determine a population signal 128 for a particular document in a search result. 

[0108] In the embodiment shown, this population signal 128 is calculated using the 
data determined in previous blocks 300-324 discussed and the algorithm shown in 
example (1). As applied to the prior example for the query "boating," the population 
processor 138 determines a population signal 128 for a particular document in a search 
result 132. As represented by "S(Q,Dj)" in the population signal function as shown 
above in subroutine 212, a weighted value representing the weighted total number of user 
clicks on document "j" after counting clicks by users of a particular population "P" is 
determined by the population processor 138. This is carried out by performing the 
mathematical functions as indicated by the algorithm described above to calculate the 
"S(Q,Dj)" for document "j." 
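
By way of illustration only, a weighted click total of this general kind can be sketched as follows. The algorithm of example (1) is not reproduced in this passage, so the formula below is a stand-in chosen for illustration and should not be read as that algorithm. The function name population_signal, the default population click weight of 2.0, and the additive smoothing form are all assumptions.

```python
def population_signal(
    all_clicks: int,
    population_clicks: int,
    smoothing_factor: float = 10.0,
    population_click_weight: float = 2.0,
) -> float:
    """Return a weighted click total for one document and one query.

    Assumed stand-in formula (not example (1) of the description):
        all_clicks + weight * trust * population_clicks
    where trust = population_clicks / (population_clicks + smoothing_factor).
    """
    if all_clicks < 0 or population_clicks < 0:
        raise ValueError("click counts cannot be negative")
    if population_clicks > all_clicks:
        raise ValueError("population clicks cannot exceed clicks by all users")
    if smoothing_factor <= 0:
        raise ValueError("the smoothing factor must be positive")
    trust = population_clicks / (population_clicks + smoothing_factor)
    return all_clicks + population_click_weight * trust * population_clicks


# Counts from the "boating" example: 101 clicks overall, 20 from France.
boating_signal = population_signal(all_clicks=101, population_clicks=20)
```

In this sketch a document with no clicks from the population keeps its unweighted total, and the contribution of the population grows with both the number of its clicks and the trust placed in them.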

[0109] Block 328 is followed by decision block 330, in which a decision is made 
whether all documents for a search query have been processed. The search engine 124 
compares the counter or variable "j" initially set at a value of "1" in block 302 to the 
variable "M," which has been set to a value according to the number of documents to be 
processed for the search query. If all the documents have been processed, then the 
counter or variable "j" will equal the variable "M" and the "YES" branch is followed to 
block 332. In alternative embodiments, a maximum number of documents for analysis 
may be set. For example, "M" may be set to a maximum number that is less than the 
number of documents determined in block 302. 

[0110] In block 332, the subroutine 212 ends. 

[0111] If, however, in decision block 330, not all of the documents have been 
processed and the counter or variable "j" is not equal to the variable "M," then the "NO" 
branch is followed to block 334. 

[0112] In block 334, a counter is incremented to track the number of documents that 
have been processed. For example, the counter or variable "j" initially set at a value of 
"1" is incremented to a next value such as "2." The subroutine 212 then returns to block 
324 to continue processing the next document. Subsequent documents are processed by 
blocks 324-328, and the counter or variable "j" at block 334 is subsequently incremented 
until all of the documents are processed, and the value of the counter or variable "j" 
equals "M." Thus, in the example provided previously for "boating," all 15 documents of 
the search result for the search query "boating" would be processed by blocks 324-328. 
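
By way of illustration only, the loop formed by blocks 324 through 334 can be sketched as follows. The function process_documents and the callable signal_for are hypothetical; signal_for stands in for the work of blocks 324-328, whose details are not reproduced here. The optional max_documents argument corresponds to the alternative embodiment in which "M" is capped below the number of documents.

```python
from typing import Callable, List, Optional, Sequence


def process_documents(
    documents: Sequence[str],
    signal_for: Callable[[str], float],
    max_documents: Optional[int] = None,
) -> List[float]:
    """Compute one signal per document, mirroring the counter "j" and limit "M"."""
    m = len(documents)
    if max_documents is not None:
        m = max(0, min(m, max_documents))   # optional cap on "M"
    signals: List[float] = []
    if m == 0:
        return signals
    j = 1                                    # counter starts at "1"
    while True:
        signals.append(signal_for(documents[j - 1]))   # blocks 324-328
        if j == m:                           # decision block 330, "YES" branch
            break
        j += 1                               # block 334, "NO" branch
    return signals


# Fifteen documents, as in the "boating" example; the signal here is a placeholder.
documents = [f"D{n}" for n in range(1, 16)]
signals = process_documents(documents, signal_for=lambda doc: 1.0)
```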

[0113] When all of the documents have been processed, the "YES" branch is 
followed from decision block 330, and the subroutine 212 ends at block 332. 

[0114] Referring again to FIG. 2, subroutine 212 is followed by block 214, in which 
the population signal for each document is provided to the ranking processor. For 
example, in the embodiment shown, the calculated score "S(Q,Dj)" for each document "1 
- N" would be included in "N" population signals. The population signal 128 for each 
document is transmitted to the ranking processor 136 for determining subsequent 
rankings or scores of search results in response to other search queries. The ranking 
processor 136 includes a ranking or scoring function or set of computer-executable 
instructions that incorporates the population signal 128 and/or other output from the 
population processor 138. For example, a weighted value generated from subroutine 212 
is transmitted to the ranking processor 136, which utilizes a population signal 128 such as 
a weighted value to rank or otherwise score subsequent search results. Other signals 130 
generated for each document by the search engine 124 or another system or method can 
also be transmitted to the ranking processor 136 to rank or score subsequent search 
results. 

[0115] Block 214 is followed by block 216, in which search results are provided. 
The ranking processor 136 generates a ranking or scoring of each document located in a 
search result 132 in response to a search query 114. Using the population signal 128 
from block 214, such as a weighted value, the ranking processor 136 affects the ranking 
or scoring of one or more documents located in a search result 132. Note that the ranking 
processor 136 can use other signals such as those shown in FIG. 1 as 130 in conjunction 
with the population signal 128 to rank or otherwise score documents of a search result 
132. In some instances, the ranking processor 136 can further decide whether to utilize a 
particular population signal 128 and/or other signals 130 during processing of a score or 
ranking for a search result 132. 
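
By way of illustration only, one way a ranking function might incorporate a population signal alongside other signals can be sketched as follows. The description does not specify how the ranking processor 136 combines signals, so the linear combination, the function rank_documents, the weight of 0.5, and the sample scores are all assumptions. The flag use_population_signal reflects the option of deciding whether to utilize a particular population signal.

```python
from typing import Dict, List


def rank_documents(
    other_signals: Dict[str, float],
    population_signals: Dict[str, float],
    population_weight: float = 0.5,
    use_population_signal: bool = True,
) -> List[str]:
    """Return document identifiers ordered from highest to lowest score."""
    scores: Dict[str, float] = {}
    for document, base_score in other_signals.items():
        score = base_score
        if use_population_signal:
            # Documents without a population signal contribute nothing extra.
            score += population_weight * population_signals.get(document, 0.0)
        scores[document] = score
    return sorted(scores, key=lambda document: scores[document], reverse=True)


# Hypothetical scores: "D2" leads on other signals, "D1" on the population signal.
other = {"D1": 1.0, "D2": 1.2}
population = {"D1": 0.8, "D2": 0.1}
with_population = rank_documents(other, population)
without_population = rank_documents(other, population, use_population_signal=False)
```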

[0116] Block 216 is followed by block 218, in which the method 200 ends. 

[0117] In other embodiments of the invention, the method 200 can be utilized in an 
iterative manner to determine a new or updated population signal whenever new data or 
changes to data in the population database 120 and/or clickthrough database 122 or other 
data storage devices are received or otherwise obtained. When a new or updated 
population signal is determined, the signal can then be transmitted to the ranking 
processor 136 to change or to update the ranking or scores for a search result 132. 
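
By way of illustration only, such iterative use can be sketched as a store that recomputes a signal only for the query and document pairs touched by newly received clicks. The class SignalStore, its method ingest, and the simplified one-argument signal function are hypothetical; an actual population signal would depend on more inputs than the total click count used here.

```python
from collections import Counter
from typing import Callable, Dict, Iterable, Set, Tuple

Key = Tuple[str, str]  # (query, document)


class SignalStore:
    """Keeps click counts and refreshes signals when new clicks arrive."""

    def __init__(self, signal_fn: Callable[[int], float]) -> None:
        self._signal_fn = signal_fn          # placeholder for the signal function
        self._clicks: Counter = Counter()
        self.signals: Dict[Key, float] = {}

    def ingest(self, new_clicks: Iterable[Key]) -> Set[Key]:
        """Add new clicks and recompute signals only for the affected pairs."""
        touched: Set[Key] = set()
        for key in new_clicks:
            self._clicks[key] += 1
            touched.add(key)
        for key in touched:
            self.signals[key] = self._signal_fn(self._clicks[key])
        return touched                       # pairs whose ranking may need updating


store = SignalStore(lambda clicks: clicks / (clicks + 10))
store.ingest([("boating", "D1")] * 10)
first = store.signals[("boating", "D1")]     # after 10 clicks
store.ingest([("boating", "D1")] * 10)
second = store.signals[("boating", "D1")]    # after 20 clicks
```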

[0118] While the above description contains many specifics, these specifics should 
not be construed as limitations on the scope of the invention, but merely as 
exemplifications of the disclosed embodiments. Those skilled in the art will envision 
many other possible variations that are within the scope of the invention. 